A Resizable Mini-batch Gradient Descent based on a Multi-Armed Bandit
Authors
Abstract
Determining an appropriate batch size for mini-batch gradient descent is time consuming, as it typically relies on grid search. This paper considers a resizable mini-batch gradient descent (RMGD) algorithm, based on a multi-armed bandit, that aims to match the best performance attainable by grid search while selecting the batch size at each epoch with a probability defined as a function of its previous successes and failures. This probability encourages exploration of different batch sizes early on and exploitation of batch sizes with a history of success later. At each epoch, the RMGD samples a batch size from its probability distribution and uses the selected batch size for mini-batch gradient descent. After obtaining the validation loss for that epoch, the probability distribution is updated to reflect the effectiveness of the sampled batch size. The RMGD thus helps the learning process explore the space of possible batch sizes and exploit the successful ones. Experimental results show that the RMGD achieves performance better than the best-performing single batch size, and it attains this performance in far less time than grid search. Surprisingly, the RMGD even outperforms grid search itself.
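The epoch-level loop the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy regression problem, the candidate batch sizes, the ±1 success/failure reward, the multiplicative-weights update with step size eta, and the helpers train_one_epoch and val_loss are all assumptions standing in for the paper's actual model, cost function, and probability update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: linear regression on synthetic data (stands in for any model).
X = rng.normal(size=(1000, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.1 * rng.normal(size=1000)
X_tr, y_tr, X_val, y_val = X[:800], y[:800], X[800:], y[800:]

def train_one_epoch(w, batch_size, lr=0.01):
    """One epoch of mini-batch gradient descent on the squared loss."""
    idx = rng.permutation(len(X_tr))
    for start in range(0, len(idx), batch_size):
        b = idx[start:start + batch_size]
        grad = X_tr[b].T @ (X_tr[b] @ w - y_tr[b]) / len(b)
        w = w - lr * grad
    return w

def val_loss(w):
    """Validation loss used to judge the sampled batch size."""
    return np.mean((X_val @ w - y_val) ** 2)

# RMGD-style loop: sample a batch size, train one epoch, then reweight
# that batch size according to its success or failure on validation loss.
batch_sizes = [16, 32, 64, 128]        # candidate arms (an assumption)
weights = np.ones(len(batch_sizes))    # one weight per candidate batch size
w = np.zeros(10)
best = val_loss(w)
eta = 0.5                              # step size for the bandit update
for epoch in range(50):
    probs = weights / weights.sum()    # sampling distribution over batch sizes
    i = rng.choice(len(batch_sizes), p=probs)
    w = train_one_epoch(w, batch_sizes[i])
    loss = val_loss(w)
    # Success (+1) if validation loss improved, failure (-1) otherwise;
    # a multiplicative-weights update raises or lowers that arm's probability.
    weights[i] *= np.exp(eta * (1.0 if loss < best else -1.0))
    best = min(best, loss)
print("final validation loss:", best, "batch-size probabilities:", probs)
```

The multiplicative update here is just one plausible reward scheme; the abstract only specifies that the probability is a function of each batch size's success/failure history, and any bandit-style update with that property would fit the same loop.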
Similar references
Exponentiated Gradient LINUCB for Contextual Multi-Armed Bandits
We present Exponentiated Gradient LINUCB, an algorithm for contextual multi-armed bandits. This algorithm uses Exponentiated Gradient to find the optimal exploration of the LINUCB. Within a deliberately designed offline simulation framework we conduct evaluations with real online event log data. The experimental results demonstrate that our algorithm outperforms the surveyed algorithms.
Online Learning with Partial Feedback
In previous lectures we talked about the general framework of online convex optimization and derived an algorithm for prediction with expert advice from this general framework. To apply the online algorithm, we need to know the gradient of the loss function at the end of each round. In the prediction of expert advice setting, this boils down to knowing the cost of each individual expert. In thi...
An Empirical Analysis of Bandit Convex Optimization Algorithms
We perform an empirical analysis of bandit convex optimization (BCO) algorithms. We motivate and introduce multi-armed bandits, and explore the scenario where the player faces an adversary that assigns different losses. In particular, we describe adversaries that assign linear losses as well as general convex losses. We then implement various BCO algorithms in the unconstrained setting and nume...
Gap-free Bounds for Stochastic Multi-Armed Bandit
We consider the stochastic multi-armed bandit problem with unknown horizon. We present a randomized decision strategy which is based on updating a probability distribution through a stochastic mirror descent type algorithm. We consider separately two assumptions: nonnegative losses or arbitrary losses with an exponential moment condition. We prove optimal (up to logarithmic factors) gap-free bo...
Differentiating the Multipoint Expected Improvement for Optimal Batch Design
This work deals with parallel optimization of expensive objective functions which are modelled as sample realizations of Gaussian processes. The study is formalized as a Bayesian optimization problem, or continuous multi-armed bandit problem, where a batch of q > 0 arms is pulled in parallel at each iteration. Several algorithms have been developed for choosing batches by trading off exploitati...